Optimized Register Renaming Scheme for Stack-Based x86 Operations

Authors

  • Xuehai Qian
  • He Huang
  • Zhenzhong Duan
  • Junchao Zhang
  • Nan Yuan
  • Yongbin Zhou
  • Hao Zhang
  • Huimin Cui
  • Dongrui Fan
Abstract

The stack-based floating-point unit (FPU) of the x86 architecture limits its floating-point (FP) performance. A flat register file can improve FP performance but compromises x86 compatibility. This paper presents an optimized two-phase floating-point register renaming scheme used in implementing an x86-compliant processor. The two-phase renaming scheme eliminates implicit dependencies between consecutive FP instructions and removes redundant operations. In two applications of the method, the techniques used in the second phase of the scheme can eliminate redundant loads and reduce the mis-speculation ratio of the load-store queue. Moreover, adding the architectural support required by this optimized scheme can also boost the performance of a binary translation system that translates x86 instructions to a MIPS-like ISA.
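The abstract does not spell out the two phases, so the following is only a minimal sketch of the general idea of renaming a stack-based FP register file, not the authors' design. It assumes a first step that turns a stack-relative operand ST(i) into an absolute logical index using a decode-time copy of the x87 TOP pointer, and a second step that maps that logical index to a physical register through a table, so that an exchange such as FXCH becomes a table swap instead of a data move. The class name `X87Renamer`, its methods, and the physical register count are hypothetical.

```python
class X87Renamer:
    """Toy renamer for an 8-entry FP register stack (illustrative only)."""

    def __init__(self, num_physical=16):
        # Assumption: a decode-time copy of the x87 TOP pointer.
        self.top = 0
        # Assumption: a simple free list; an empty list would raise
        # IndexError here, where real hardware would stall instead.
        self.free = list(range(num_physical))
        # Logical (absolute) register -> physical register, None if empty.
        self.map = [None] * 8

    def _logical(self, i):
        # Step 1: stack-relative ST(i) -> absolute logical index.
        return (self.top + i) % 8

    def push(self):
        # A load such as FLD decrements TOP; the new ST(0) gets a
        # fresh physical register.
        self.top = (self.top - 1) % 8
        phys = self.free.pop(0)
        self.map[self.top] = phys
        return phys

    def read(self, i):
        # Step 2: absolute logical index -> physical register.
        return self.map[self._logical(i)]

    def fxch(self, i):
        # Exchange ST(0) and ST(i) by swapping map entries only;
        # no FP data is moved.
        a, b = self._logical(0), self._logical(i)
        self.map[a], self.map[b] = self.map[b], self.map[a]

    def pop(self):
        # A pop marks the old ST(0) empty and increments TOP.
        # Recycling the physical register is left out of this sketch.
        phys = self.map[self.top]
        self.map[self.top] = None
        self.top = (self.top + 1) % 8
        return phys
```

In this sketch, two pushes followed by `fxch(1)` leave the same two physical registers in place and only exchange which stack position names them, which is the sense in which the exchange needs no execution resources.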


Similar resources

Stack Renaming of the Java Virtual

This study proposes a scheme to map the operand stack of the Java Virtual Machine to hardware registers and evaluates the performance benefits of the proposed scheme. Using the technique of register renaming while mapping the stack to registers, we are able to exploit the inherent parallelism in the instruction stream. The simulation results conducted show an improvement of about 15%-26% for th...

Improving Memory Access Performance Using a Code Coalescing Unit

High clock frequencies combined with deep pipelining employed by many of the state-of-the-art processors have forced cache hit accesses to be multi-cycle operations. For many programs, untolerated load latencies account for a significant portion of total execution time. In this paper, we present a mechanism called the Code Coalescing Unit (CCU) that can identify and eliminate at run-time several...

Dynamic Register Renaming Through Virtual-Physical Registers

Register file access time represents one of the critical delays of current microprocessors, and it is expected to become more critical as future processors increase the instruction window size and the issue width. This paper presents a novel dynamic register renaming scheme that delays the allocation of physical registers until a late stage in the pipeline. We show that it can provide important ...

Delft-Java Dynamic Translation

This paper describes the DELFT-JAVA processor and the mechanisms required to dynamically translate JVM instructions into DELFT-JAVA instructions. Using a form of hardware register allocation, we transform stack bottlenecks into pipeline dependencies which are later removed using register renaming and interlock collapsing arithmetic units. When combined with superscalar techniques and multiple i...

SMLNJ: Intel x86 back end Compiler Controlled Memory

This note describes the code generation algorithm used for the Intel x86, introduced in version 110.16. The standard Chaitin graph coloring register allocation cannot be used directly for machines with few registers, as all temporaries wind up being spilled, making for a poor allocation [Cha82]. Thus, for the x86, the conceptual model of the architecture has been extended with a set of memory lo...

Publication date: 2007